Abstract

The last years have seen a growing interest in collaborative systems like electronic marketplaces and P2P
file sharing systems where people are intended to interact with other people. Those systems, however,
are subject to security and operational risks because of their open and distributed nature. Reputation
systems provide a mechanism to reduce such risks by building trust relationships among entities and
identifying malicious entities. A popular reputation model is the so-called flow-based model. Most
existing reputation systems based on such a model provide only a ranking, without absolute reputation
values; this makes it difficult to determine whether entities are actually trustworthy or untrustworthy.
In addition, those systems ignore a significant part of the available information; as a consequence,
reputation values may not be accurate. In this paper, we present a flow-based reputation metric that
gives absolute values instead of merely a ranking. Our metric makes use of all the available information.
We study, both analytically and numerically, the properties of the proposed metric and the effect of attacks
on reputation values.



1 Introduction

The advent of the Internet has brought new business opportunities and favored the development of collaborative
environments. In particular, the Internet provides the basis for the development of electronic
communities where strangers interact with each other and possibly do business. However, these interactions
involve risks. For instance, in an eCommerce setting, buyers are vulnerable to risks due to potential
incomplete or misleading information provided by sellers 1,34,1 . Similarly, sellers are subject to the risk that 
the counterparty in a transaction will be unable to honor its financial obligations. To mitigate those risks, 
there is a need for a decision support system that is able to determine the trustworthiness of collaborative
parties. 

Reputation systems are widely considered as 'the solution' to assess trust relationships among users and to 
identify and isolate malicious users [28 1. Reputation systems are currently adopted in commercial online 
applications such as P2P file sharing |8|, web search |i9J, electronic marketplaces |3, 15 1, and expert systems 
Il2][l6l. Reputation is a collective measure of trustworthiness based on aggregated feedback related to past 
experiences of users. The basic idea is to let users rate each other and to aggregate ratings about a given user 
to derive a reputation value. This value is then used to assist other users in deciding whether to interact 
with that user in the future ETIl . In the last years, a number of reputation systems have been proposed 
to aggregate ratings and calculate reputation values; each system is based on a particular theoretical
foundation (see |fT8ll2TI for a survey). 

The quality of a reputation system is determined by how accurately the computed reputation predicts the 
future performance of entities fTS]. This, however, is difficult to achieve because some users can attempt 
to manipulate their reputation and the reputation of others for their own benefit. Most existing reputation 
systems lack the ability to discriminate honest ratings from dishonest ones. Therefore, such systems are
vulnerable to malicious users who provide unfair ratings f34l. 

The issue of discriminating honest from dishonest ratings is usually addressed by reputation systems using 
the so called flow model 11211 as the mathematical foundation. Examples of such systems are EigenTrust 
|l23l, PageRank f9l, SALSA fSS^, and PeerTrust f34l. What makes them appealing is that reputation is 
computed taking into account the feedback of all the users involved in the system, and the feedback is 
weighted with respect to the reputation of the user providing the feedback. Flow models are often based
on the theory of Markov chains. The feedback provided by the users is aggregated and normalized in order
to obtain a Markov chain. Thereby, starting from a vector of initial reputation values, Markov steps are
repeatedly applied until a stable state has been reached.

Figure 1: Example scenarios. 1000 ratings given to Bob, Charlie and David.

Unfortunately, the current state of affairs regarding this kind of reputation model is not very satisfactory. 
First of all, those systems only provide a ranking of users rather than an absolute reputation value. Although 
this can be acceptable in some applications like web search, it is not in others like electronic marketplaces. 
For instance, a buyer prefers to do business with an honest seller rather than with the most trustworthy 
one in a pool of dishonest sellers. In a scenario where health care providers are willing to use data created 
by patients |32|, the quality of the data provided by a patient cannot be assessed by only looking if he 
is more capable than other patients to do good measurements; an absolute quality metric is required. In 
addition, most of the flow-based systems ignore a significant part of the available information (e.g., negative 
feedback). Consequently, the reputation values those systems return may be inaccurate. 
To illustrate these points, let us consider an electronic marketplace where users can rate each other after 
each transaction, as in eBay ifTSl . Here, each time Alice has a transaction with another user j (e.g.. Bob, 
Charlie, David), she may rate the transaction as positive, neutral, or negative. Let us consider scenarios 
(a) and (b) in Fig. 1. From a reputation metric we would expect that Alice has an almost neutral opinion of
Bob and Charlie and a negative opinion of David in (a), and a positive opinion of Bob and Charlie and a neutral
opinion of David in (b). However, if we apply the reputation metric proposed in [23] to these scenarios,
we have that in both (a) and (b) Bob has local trust value¹ 0.1, Charlie 0.9 and David 0. The formulas used
to compute these values will be presented in Section 2. Here, we just want to point out that the metric in
[23] is unable to distinguish between cases (a) and (b). This lack of distinguishing power can be risky for
users as it can mislead them in their decision whether to do business with other users. For instance, the
reputation value of Charlie computed using the metric in [23] in (a) can lead users to think that Charlie
is 'very' trustworthy while in fact he is not. These reputation values only indicate that Charlie is more 
trustworthy than others (i.e., a ranking without an absolute scale). 

Moreover, it is worth noting that in [23] negative ratings are discarded in order to obtain a Markov chain.
Consequently, it is not possible to distinguish between users that have (strong) negative reputation and
users that have neutral reputation. This can be observed by comparing the ratings received by David in
scenarios (a) and (b) of Fig. 1: although David received a large number of negative ratings and no positive
ratings in (a) and an equal number of positive and negative ratings in (b), his reputation value is equal to
0 in both scenarios.

Last but not least, the design of flow-based reputation models requires including a number of parameters 
which intend to guarantee the convergence of computations. However, a comprehensive and exhaustive 
study of the impact of such parameters on reputation values and how they can be used to protect the 
systems against attacks has not been conducted yet. 



¹ In [23], local trust values indicate the opinion that users have of other users based on past experiences. Local trust values are in
the range [0, 1].

Our contributions. In this paper, we present a reputation metric that enhances existing flow-based reputation
metrics (see [9], [23], [31]) by providing absolute values instead of merely a ranking, and by not
discarding any available information. Computing absolute reputation values makes it possible to quantify
the trustworthiness of users and therefore provides a measure to unambiguously compare reputation values.
This allows us, for instance, to distinguish cases (a) and (b) in Fig. 1. In the design of our reputation metric,
we study the effect of self-reference (i.e., a user who gives feedback to himself). We demonstrate that
our construction minimizes such an effect, leading to reputation values that are closer to intuitive expectations.
We formally prove that the proposed reputation metric always has a solution, and that the solution is
unique. We also discuss several methods of solving the reputation equation numerically.
Our metric depends on a number of parameters: a pattern matrix, which stores the (aggregated) feedback 
received by the system owner from the users about the interactions they had with other users (hereafter, 
the pattern matrix is also called indirect evidence matrix), a starting reputation vector, which represents the 
direct information known to the system owner about the trustworthiness of entities in the system, and an 
interpolation parameter $\alpha$, which serves as a weight for direct versus indirect information. We analytically
study the impact of changes in the indirect evidence matrix on reputation values. This study allows us 
to analyze how someone can attack the reputation system by providing unfair ratings. In particular, we 
analyze self-promoting and slandering attacks [18] as well as Sybil attacks [14]. To study self-promoting
(slandering) attacks, we assume that an attacker can manipulate reputation values by giving positive ratings 
to users who gave positive ratings to him (negative ratings to the target) and negative ratings to users 
who gave negative ratings to him (positive ratings to the target). We study the effect of Sybil attacks by 
modeling an attacker who subverts the reputation system by first creating a large number of pseudonymous 
entities, and then using them to influence the reputation value of a target user in a similar way as is done 
for self-promoting and slandering attacks.

On the other hand, we assume that the starting reputation vector and the weight parameter are defined by the 
system owner and cannot be modified by the attacker. We numerically study the impact of these parameters 
on reputation values and analyze how they can be used to mitigate the effect of the above-mentioned attacks.
The analysis allows us to draw some guidelines for choosing the value of these parameters. The guidelines 
are general and apply to reputation metrics that use similar parameters. 

In this work we are mainly interested in the study of the mathematical model of reputation systems, rather 
than in the algorithm implementing the mathematical model. Therefore, we assume throughout the paper 
the existence of a central authority which collects all ratings and calculates the reputation of every participating
user. This assumption is in line with the approach proposed in [9], where a search engine collects
information about hyperlinks of several million pages and indexes search results on the basis of such
information.

The paper is structured as follows. Section 2 provides an overview of reputation systems. Section 3
presents our metric and Section 4 discusses its formal properties. Section 5 discusses several methods
for computing the reputation vector. Section 6 evaluates reputations numerically for a number of attack
scenarios. Section 7 concludes and discusses directions for future work.

2 Reputation Systems 

Reputation systems have been proposed as a mechanism for decision support in open collaborative systems, 
where entities do not know each other a priori. Reputation is a collective measure of trustworthiness built 
from user experience. A user's experience consists of the events observed by that user. Events can be, for 
instance, voiced opinions, that is opinions that are made public f3T|, downloads ["231, or transactions ifTSl . 
Users can rate the behavior of other users on the basis of their experience. In particular, ratings represent 
direct judgments of the behavior of users with respect to the perspective of the judging user. Those pieces 
of evidence are aggregated in order to calculate the reputation of users. Reputation gives the extent to 
which the target's behavior is good or bad [14].

In ifTSi Hoffman et al. identify three dimensions of reputation systems: formulation (the mathematical 
model), calculation (the algorithm implementing the model and actually computing reputation), and dis- 
semination (the mechanism to disseminate the outcome). Here, we mainly focus on the formulation dimen- 
sion, and on attacks on the mathematical model. The formulation of a reputation system includes a number 
of aspects: information source, information type, temporal aspects, and reputation metrics. The source of 
information can be subjective, i.e. the rating is based on subjective judgment like in ifTSllSTl . or objective, 
i.e. the rating is determined from formal criteria like in |l9l- The advantage of using objective information 



is that its correctness can be verified by other entities; however, sometimes it is difficult to define formal
criteria that fully capture entities' opinions. At the same time, subjective information makes it difficult to 
protect the system against unfair rating, which lies at the basis of self-promoting and slandering attacks (see 
lITSi ). A typical example of these attacks is the so called Sybil attack (see IT4l ). in which different entities 
or multiple identities held by the same entity collude to promote each other. Another aspect of information
sources is observability. Here it is important whether the information is directly observed by the entity 
calculating the reputation, or it is obtained second-hand or inferred from direct information. We call the 
reputation value calculated from directly observed information direct reputation². Indirect information is
widely used in reputation systems to support a notion of transitivity of trust (see [9^, 13 , 22 , 23 1). Although 
trust is not always transitive in real life ifTTl . trust can be transitive under certain semantic constraints ll22l . 
In this paper we assume that ratings have the same trust purpose (i.e., the same semantic content) and 
therefore their aggregation is meaningful. We also do not distinguish between functional trust (i.e., the 
ability to make a judgment about a transaction) and referral trust (i.e., the ability to refer to a third party). 
As in fST, '341, we assume that a user trusts the opinion of users with whom he had positive transitions, 
since users who are honest during transactions are also likely to be honest in reporting their ratings. 
The type of information used by a reputation system has a considerable impact on the types of attack to 
which the system is vulnerable. Some reputation systems (see ifTSl l23l ) allow users to specify ternary 
ratings (positive, neutral, negative); others allow only positive ||9l [3T1l or only negative ratings. Although 
systems that only consider positive values are robust to slandering attacks, they are not flexible enough to 
discriminate between honest and malicious entities. Negative reputation systems are particularly vulnera- 
ble to whitewashing attacks 1 18 1; entities who receive a large number of negative ratings can change their 
identity and re-enter the system with a fresh reputation |25 1. Therefore, one of our requirements for repu- 
tation systems is that entities should not be able to gain an advantage from their newcomer status. At the 
same time, newcomers should not be penalized for their status. Here, the temporal aspects of a reputation 
system play a fundamental role. For instance, some systems (see ||9]ll9l|23][3T]) do not distinguish between 
recent and past behavior, whereas other systems (e.g., see HITS] EH) give more weight to recent behavior. 
For instance, in [4] reputation values are updated by aggregating the previous reputation value with a factor
indicating the proximity of the recent score to the past reputation, i.e. $r_{ij}^{(t)} = r_{ij}^{(t-1)} + \mu(d_{ij}, r_{ij}^{(t-1)})$, where
$\mu$ is a function that determines how fast the reputation value $r_{ij}$ changes after an event with rating $d_{ij}$.
A reputation metric is used to aggregate ratings and compute reputations. Several computation models have 
been used: simple summation or average of ratings |l3]|T5][l6l, Bayesian systems ll20l[30l . beta probabiUty 
density f32|, discrete trust models |10|, belief models fT, 191, fuzzy models fT, "291, and flow models 
||9, 23 , 26 , 33, 31| . Flow models are particularly interesting as they make it possible to compute reputation 
by transitive iteration through loops and arbitrary chains of entities. Here, we present the reputation system 
proposed in [23] as an example of a flow-based reputation system. Each time user i has a transaction with
another user j, she may rate the transaction as positive ($d_{ij} = 1$), neutral ($d_{ij} = 0$), or negative ($d_{ij} = -1$).
The local trust value $s_{ij}$ is defined as the sum of the ratings that i has given to j:

$$ s_{ij} = \sum_{\text{transactions}} d_{ij} \qquad (1) $$

This aggregated feedback is then normalized in order to obtain a Markov chain. Formally, the normalized
local trust value $a_{ij}$ is defined as follows

$$ a_{ij} = \frac{\max(s_{ij}, 0)}{\sum_{k} \max(s_{ik}, 0)} \qquad (2) $$



Normalized local trust values can be organized in a matrix $[a_{ij}]$ (the so-called pattern matrix). In flow
models the reputation vector (the vector containing all reputation values) corresponds to the steady state
vector of a Markov chain; one starts with a vector of initial reputation values and then repeatedly applies
the Markov step until a stable state has been reached using the following equation

$$ r^{(k+1)} = \alpha A^T r^{(k)} + (1 - \alpha) p \qquad (3) $$

where r is the reputation vector, A is the pattern matrix, p is a vector of initial reputation values, and
$\alpha \in [0, 1]$ is a damping factor³.

² Direct reputation is also called subjective reputation [27] or local trust value [23].
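The repeated Markov step described above can be sketched in a few lines of Python. This is only an illustration of the iteration, not the implementation used in [23]: the function name `flow_reputation`, the 3-user pattern matrix, the uniform initial vector, and the damping value 0.85 are all assumptions made for this example.

```python
def flow_reputation(A, p, alpha=0.85, tol=1e-12, max_iter=1000):
    """Iterate r <- alpha * A^T r + (1 - alpha) * p until the update is below tol.

    A is a row-normalized pattern matrix (each row sums to 1), p is the
    vector of initial reputation values, alpha is the damping factor.
    """
    n = len(p)
    r = list(p)
    for _ in range(max_iter):
        # (A^T r)_j = sum_i A[i][j] * r[i]
        nxt = [alpha * sum(A[i][j] * r[i] for i in range(n)) + (1 - alpha) * p[j]
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r


# Hypothetical pattern matrix: row i holds the normalized local trust
# values that user i assigns to the other users.
A = [[0.0, 0.1, 0.9],
     [0.5, 0.0, 0.5],
     [1.0, 0.0, 0.0]]
p = [1 / 3, 1 / 3, 1 / 3]
r = flow_reputation(A, p)
print(r, sum(r))
```

Because every row of this example matrix sums to 1 and the initial vector sums to 1, the resulting values also sum to 1, which is why they carry only relative information.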

Unfortunately, the current state of affairs regarding this kind of reputation model is not very satisfactory.
First of all, the approach itself has a drawback. In the Markov chain approach, reputation values need to
be normalized in the sense that they add up to 100% (2). The problem is that such reputation values carry
relative information only. Applying (1) and (2) to the two scenarios presented in Fig. 1, we obtain in both
scenarios that Bob has normalized local trust value equal to 0.1, Charlie has 0.9 and David has 0. This
is good enough for ranking, but when an absolute measure is required, the Markov chain approach fails.
Actually, one may expect that Bob and Charlie have a similar reputation value in the first scenario; also
that the reputation value of Bob in the second scenario is greater than the reputation value of Charlie in
the first scenario. In addition, when entities have a similar reputation value, it is impossible to see whether
they are all trustworthy or all untrustworthy. Suppose a scenario (i) in which Bob and Charlie receive ten
positive ratings out of 1000 transactions from Alice and a scenario (ii) in which Bob and Charlie receive
900 positive ratings out of 1000 transactions. In principle, Bob and Charlie should have neutral reputation
in (i) and strongly positive reputation in (ii). However, because of the normalization in (2), from Alice's
perspective Bob and Charlie have normalized local trust value equal to 0.5 in both (i) and (ii).
Furthermore, implementations of flow models ignore a significant part of the available information: while
ratings are positive, negative or neutral, their aggregation ignores the negative values and maps them to
zero. For instance, EigenTrust [23] takes the sum $s_{ij}$ of the ratings of all transactions between entities
i and j, and normalizes it with respect to the sum of all the positive ratings given by i (see (2)). As a
consequence, it is not possible to discriminate between users that have bad reputation and users that have
neutral reputation. Consider, for example, the local trust value of David in the two scenarios of Fig. 1: by
applying (1), we obtain −900 in (a) and 0 in (b). However, after normalizing using (2), we obtain 0 in both
scenarios.
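The loss of information caused by the normalization can be reproduced with a short Python sketch. The net rating sums used below (1, 9, −900 in scenario (a) and 100, 900, 0 in scenario (b)) are assumptions chosen to be consistent with the values quoted in the text; the content of Fig. 1 itself is not reproduced here, and the function name `normalize` is hypothetical.

```python
def normalize(s_row):
    """Normalized local trust: max(s_ij, 0) / sum_k max(s_ik, 0)."""
    positive = {j: max(s, 0) for j, s in s_row.items()}
    total = sum(positive.values())
    # Assumption: if a user gave no net-positive ratings, all values are 0.
    return {j: (v / total if total > 0 else 0.0) for j, v in positive.items()}


# Assumed local trust values s_ij (sum of Alice's ratings per user).
scenario_a = {"Bob": 1, "Charlie": 9, "David": -900}
scenario_b = {"Bob": 100, "Charlie": 900, "David": 0}

norm_a = normalize(scenario_a)
norm_b = normalize(scenario_b)
print(norm_a)
print(norm_b)
```

Both scenarios yield 0.1 for Bob, 0.9 for Charlie and 0 for David, even though the underlying sums differ by two orders of magnitude and David's sum is strongly negative only in (a).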

Finally, the metrics based on Markov chains include parameters which aim to guarantee the convergence
of computations and to resist malicious coalitions (e.g., the damping factor $\alpha$ and the vector of initial
reputation values p in (3)). Unfortunately, the impact of these parameters on reputation values has not been
studied in sufficient detail.

3 Our reputation metric 
3.1 Reputation model 

Reputation is a collective measure of trustworthiness based on the judgment of a community. The users in 
the community can interact with each other and rate the counterpart in the transaction after the completion 
of the transaction. The reputation value of a given user is computed by aggregating the ratings that other 
users in the community gave to that user and reflects the level of trust that they have in the user on the
basis of their past experience. In the remainder of this section, we discuss the rating system, the method 
for aggregating ratings, and the metric for calculating reputation values from the aggregated ratings. 
Ratings are collected by a central authority using a rating system. We adopt a rating system where ratings
are bound to the corresponding transaction. Ratings can be positive, negative, or neutral; we do not
impose any restriction on the range of values of ratings.

The central authority aggregates ratings in order to compute the reputation values of all users involved in
the system. We assume that aggregated ratings lie in the range [0, 1], where 1 means very good, 0 very
bad, and 1/2 neutral. The restriction to [0, 1] does not affect the generality of the model: values lying in a
different interval (and even qualitative values) can easily be mapped to [0, 1]. In this way, all the available
information (including negative ratings) can be used in the computation of reputation.
A number of factors should be taken into account when ratings are aggregated (see PI ISTllMl ): 

• the ratings a user receives from other users,

• the total number of ratings a user receives from other users,

• the credibility of the rating source,

• the size of the transaction, and

• the time of the transaction.

³ Note that $\alpha$ is different from the '$\alpha$' in PageRank [9], whose purpose is to modify the matrix.

Several aggregation methods based on (some of) these factors have been proposed. In ifTSJI ratings are 
aggregated by summing the positive and negative ratings that the user receives from other users. However, 
it is well known that methods based only on ratings are flawed fT2] [MJ. Indeed, a user can increase 
his reputation by increasing the transaction volume to cover the fact that he is cheating at a certain rate. In
particular, the user can build a good reputation in small transactions and then act dishonestly in large 
transactions (VJ). To prevent this, an aggregation method should also take into account other factors like the 
total number of the transactions in which a user is involved and the size of the transaction. In addition, some 
existing reputation systems use threshold functions for accurate discrimination between trustworthy and 
untrustworthy users [31]. In particular, the ratings provided by a user are considered only if the credibility
of the user is greater than a certain threshold. To discriminate between past and recent behavior, some 
reputation systems update reputation by aggregating the previous reputation with a factor indicating the 
proximity of the recent rating to the past reputation [4].

The following example presents a simple method for aggregating ratings that incorporates the ratings a user 
receives from other users, the total number of ratings, and the criticality of the transactions. Intuitively, the 
aggregated ratings are defined as the weighted ratio of the sum of positive and negative ratings averaged 
over the total criticality of transactions. In the example, we do not consider the credibility of the rating 
source because this factor is used later in (5) to calculate reputation values from aggregated ratings. In
(5), the credibility of a user is given by the reputation of the user. We refer to [4] for an example of a
time-sensitive aggregation method.

Example 1 Consider the electronic marketplace scenarios of Fig. 1. Let $V_{xy}$ be the set of transactions
between users x and y, let $q : V_{xy} \to \{1, 0, -1\}$ be a function that returns the rating given by y to x for the
transaction and $w : V_{xy} \to \mathbb{N}$ a function that assigns a criticality value to the transaction. The aggregated
ratings $A_{xy}$ can be computed as the sum of individual ratings weighted with respect to the criticality of the
transactions and then mapped into the range [0, 1] as follows

$$ A_{xy} = \frac{1}{2}\left(1 + \frac{\sum_{v \in V_{xy}} q(v)\, w(v)}{\sum_{v \in V_{xy}} w(v)}\right) \qquad (4) $$

If we apply (4) to the scenarios of Fig. 1 (and assuming that all transactions have the same criticality
value), we obtain that the values computed by aggregating the ratings given by Alice to Bob, Charlie, and
David in (a) are equal to 0.5005, 0.5045, and 0.05 respectively, whereas scenario (b) gives 0.55, 0.95, and
0.5 respectively. These values are closer to what one would expect than the results of (2), namely 0.1, 0.9,
and 0 for Bob, Charlie, and David respectively in both scenarios. (Here 1 means very good and 0 very bad.)
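The aggregation of Example 1 can be checked numerically with the following Python sketch, which implements the criticality-weighted mean of the ratings mapped into [0, 1]. The split of the 1000 ratings into positive, neutral and negative counts is an assumption chosen to reproduce the values quoted above (only the net sums matter when all criticalities are equal); the names `aggregated_rating` and `make_ratings` are hypothetical.

```python
def aggregated_rating(transactions):
    """Map criticality-weighted ratings into [0, 1].

    transactions: list of (q, w) pairs with rating q in {1, 0, -1}
    and criticality w a positive integer.
    """
    total_weight = sum(w for _, w in transactions)
    if total_weight == 0:
        return 0.5  # assumption: no evidence is treated as neutral
    weighted_sum = sum(q * w for q, w in transactions)
    return 0.5 * (1 + weighted_sum / total_weight)


def make_ratings(pos, neu, neg, w=1):
    """Build a list of (rating, criticality) pairs with equal criticality w."""
    return [(1, w)] * pos + [(0, w)] * neu + [(-1, w)] * neg


# Assumed rating counts (positive, neutral, negative), 1000 ratings each.
scenario_a = {"Bob": (1, 999, 0), "Charlie": (9, 991, 0), "David": (0, 100, 900)}
scenario_b = {"Bob": (100, 900, 0), "Charlie": (900, 100, 0), "David": (500, 0, 500)}

agg_a = {u: aggregated_rating(make_ratings(*c)) for u, c in scenario_a.items()}
agg_b = {u: aggregated_rating(make_ratings(*c)) for u, c in scenario_b.items()}
print(agg_a)
print(agg_b)
```

Unlike the normalized local trust values, the aggregated ratings distinguish the two scenarios and keep David's strongly negative record in (a) visible.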

The set of all aggregated ratings $A_{xy}$ can be organized in a matrix. We refer to Table 1 for the notation
used hereafter.

Definition 1 For n users, the aggregated ratings are contained in an irreducible n × n matrix A, with

• $A_{xy} \in [0, 1]$ for $x \neq y$;

• $A_{xx} = 0$.

$A_{xy}$ represents the aggregated ratings of user x from the perspective of user y. We impose that self-references
are not included in the aggregation ($A_{xx} = 0$ for all x). This choice is motivated in Section 3.2,
where we show that a nonzero diagonal has undesirable consequences in a simple toy scenario. In Section 6.4
we present numerical results on the effect of self-reference.



Notation                 Meaning
n                        Number of users.
[n]                      The set {1, ..., n}.
$r \in [0, 1]^n$         Column vector containing all reputations.
$s \in [0, 1]^n$         The 'starting' reputation vector.
$A_{xy} \in [0, 1]$      Aggregation of ratings of x given by y.
$\alpha \in [0, 1]$      Weight of the indirect evidence.
e                        The n-component column vector $(1, 1, \ldots, 1)^T$.
$\ell \in [0, n]$        The 'norm' $e^T r$.
$v_i$                    The i'th eigenvector of A.
$\lambda_i$              The i'th eigenvalue of A.
$\lambda_{\max}$         Largest eigenvalue of A.
$v_{\max}$               Eigenvector corresponding to $\lambda_{\max}$.
C                        The n × n constant matrix $C = e e^T$. $C_{ij} = 1$; $C^k = n^{k-1} C$.

Table 1: Notation



To compute reputation, we employ a metric that is an adaptation of the metrics in [23], [31]. In particular, we
adopt the equation proposed in [31] (see (5)), which differs from the one proposed in [23] (see (3)) in the
moment when the normalization step takes place. In [23], normalization is done once at the beginning in
order to obtain a Markov chain using (2); then, starting from a vector of initial reputation values, Markov
steps are repeatedly applied until a stable state has been reached. Conversely, in [31] reputation values are
normalized with respect to the sum of all reputation values in the reputation vector ($\sum_y r_y$ in (5)) at every
iteration to guarantee that reputation values stay in the range [0, 1]. We differ from the metric proposed
in [31] in the way the indirect evidence matrix A is defined: in [31] A is symmetric (whereas we allow
asymmetry), and $A_{xx} = 1$ (whereas we set $A_{xx} = 0$).

We consider a system with n users. The central authority determines the trustworthiness of all users based 
on his direct experience with them and the aggregated ratings. 

Definition 2 Let $s \in [0, 1]^n$, with $s \neq 0$, be a 'starting vector' containing starting values assigned to all
users by the central authority. Let $\alpha \in [0, 1]$ be a weight parameter for the importance of indirect vs. direct
evidence. We define the reputation vector $r \in [0, 1]^n$ as a function of $\alpha$, s and A by the following implicit
equation:

$$ r_x = (1 - \alpha)\, s_x + \alpha \sum_{y \in [n]} \frac{r_y}{\ell}\, A_{xy}, \qquad (5) $$

where we have introduced the notation $\ell = \sum_y r_y$.



Eq. (5) can be read as follows. If the central authority wants to determine the reputation of user x, it first
takes into account the direct information that it has about x. From this it computes $s_x$, the reputation that
it would assign to x if it had no further information. However, it also has the aggregated data in A. It gives
weight $1 - \alpha$ to its 'direct' assignment s and weight $\alpha$ to the collective result derived from A. If it did not
have any direct information about x, it would compute $r_x$ as $r_x = \sum_y (r_y/\ell) A_{xy}$, i.e. a weighted average
of the reputation values $A_{xy}$ with weights equal to the normalized reputations of all the users. Adding the
two contributions, with weights $\alpha$ and $1 - \alpha$, we end up with (5), which has the form of a weighted average
over all available information. Note that (5) can be expressed in vector notation as

$$ r = (1 - \alpha)\, s + \alpha\, \frac{A r}{e^T r}, \qquad (6) $$

where e stands for the n-component column vector $(1, 1, \ldots, 1)^T$.
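Since the reputation vector is defined only implicitly, one simple way to approximate it is to apply the right-hand side of the vector equation repeatedly, starting from s. The Python sketch below illustrates this fixed-point iteration; it is one possible numerical approach under the assumption that the iteration converges for the chosen inputs, not necessarily the method preferred by the paper. The function name `reputation`, the 3-user matrix, the starting vector and the weight 0.5 are assumptions made for the example.

```python
def reputation(A, s, alpha, tol=1e-12, max_iter=10000):
    """Fixed-point iteration r <- (1 - alpha) * s + alpha * (A r) / (e^T r).

    A[x][y] is the aggregated rating of user x given by user y (zero diagonal),
    s is the starting vector (not all zero), and 0 <= alpha < 1 is assumed so
    that the normalization e^T r stays positive.
    """
    n = len(s)
    r = list(s)
    for _ in range(max_iter):
        ell = sum(r)  # the 'norm' e^T r
        nxt = [(1 - alpha) * s[x]
               + alpha * sum(A[x][y] * r[y] for y in range(n)) / ell
               for x in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r


# Hypothetical aggregated ratings; note the asymmetry and the zero diagonal.
A = [[0.0, 0.9, 0.8],
     [0.5, 0.0, 0.4],
     [0.1, 0.2, 0.0]]
s = [0.5, 0.5, 0.5]
alpha = 0.5
r = reputation(A, s, alpha)
print(r)
```

With identical starting values, the ordering of the resulting reputations is driven entirely by the indirect evidence: the first user, who receives the highest aggregated ratings, ends up with the highest value, and all values remain in [0, 1].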



3.2 Discussion of self-references

The quality of a reputation metric is determined by the accuracy of reputation values. Here, we provide 
further motivation for our metric and, in particular, for the choice A^x = 0. We demonstrate that the 
reputation values calculated by our reputation metric are close to the expected values. 
The expression for Tx contains a term a{rx/i)Axx, the as yet unknown reputation of x multiplied by his 
'self -rating' Axx- We briefly investigate the effect of self-reference on our reputation metric. First we look 
what happens when the diagonal of A is not set to zero but to ^ G [0,1]. For large n and random A one does 
not expect a significant effect, since the diagonal consists of only n elements out of n^. (See the numerical 



results in Section 6.4 1. We consider the following scenario, which we tailored to make the diagonal stand 



out: Everybody agrees that only one user is reasonably trustworthy (let us call him user 1). Let e ^ 1 be 
a small positive constant. Let cr be a positive constant of order L We set Axy — e f or a; ^ {1, J/} and 
Aiy = b £ [0, 1] for all y ^ 1- We set Sx = ere for x ^ 1. Because in this scenario all the users except 
user 1 are treated equally, (J5]l yields the same reputation for all users x ^ 1, which we will denote as rj-ost- 



/C h b 

e Q e 



\e 



e 



C e 
e C/ 



GE 



\oe} 



r = 



^rcst 
V J'rest / 



(7) 



From a good metric we expect that user 1 has reputation (1 — a)si + 0(e) and that frest is of order 
e, preferably rjost = (1 ~ a)ae + as. Substitution of (jTl) into (J5]l yields, after some algebra, r\ = 
(1 — a)si + aC + 0(e) and rj-cst — ^ {[^^ch ~'~ ^(^^)- Clearly, our expectations are met only if C = 0. 
One could argue that setting the diagonal of A to zero is not enough to remove self-references completely: 
in the computation of Tx the normalization factor i = e^r still contains Tx, i.e. Vx affects the weights for 
the computation of r^.. In order to avoid this, one could define an alternative reputation metric t as 



tx = {\- a)Sx 



Ax 



j/e[n]\K 



L7, 



ze\n\\x^^ 



(8) 



For large $n$ and general $A$, the differences between (8) and (5) are tiny. However, substitution of the special scenario (7) into (8) gives $t_1 = (1-\alpha)s_1 + \alpha b + O(\varepsilon)$ and $t_{\rm rest} = (1-\alpha)\sigma\varepsilon + \alpha\varepsilon$. While $t_{\rm rest}$ is as desired, $t_1$ is not. There is a significant difference between $t_1$ and the desired outcome $(1-\alpha)s_1 + O(\varepsilon)$, especially when $b$ is large. As a special case consider $s_1 \ll b$, a situation where the central authority mistrusts user 1, but all the users trust him. The authority does not want his result for user 1 to be influenced heavily by the users, since their reputations are $O(\varepsilon)$.

We conclude that the metric $r$ works best when $A_{xx} = 0$ is imposed, and that $r$ is better than the metric $t$. Here 'better' means that it more closely matches our expectations of how a metric should behave.
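The scenario can also be checked numerically. The Python sketch below is an illustration only: it builds the matrix of (7) for parameter values that we picked ourselves ($n = 5$, $\varepsilon = 10^{-4}$, $\sigma = 1$, $s_1 = 0.3$, $b = 0.9$, $\alpha = 0.5$), solves (5) by plain fixed-point iteration, and prints $r_1$ next to the leading-order value $(1-\alpha)s_1 + \alpha\zeta$. The function names `scenario` and `solve_r` are hypothetical.

```python
def scenario(n, eps, sigma, s1, b, zeta):
    """Matrix A and vector s of scenario (7); index 0 plays the role of user 1."""
    A = [[eps] * n for _ in range(n)]
    for y in range(1, n):
        A[0][y] = b                # everybody rates user 1 with b
    for x in range(n):
        A[x][x] = zeta             # self-ratings on the diagonal
    s = [s1] + [sigma * eps] * (n - 1)
    return A, s


def solve_r(A, s, alpha, delta=1e-14, max_iter=100000):
    """Fixed-point iteration r <- (1 - alpha) s + alpha A r / (e^T r)."""
    n = len(s)
    r = list(s)
    for _ in range(max_iter):
        ell = sum(r)
        new = [(1 - alpha) * s[x]
               + alpha * sum(A[x][y] * r[y] for y in range(n)) / ell
               for x in range(n)]
        diff = sum(abs(p - q) for p, q in zip(new, r))
        r = new
        if diff < delta:
            break
    return r


alpha, s1, eps, sigma = 0.5, 0.3, 1e-4, 1.0   # assumed illustration values
for zeta in (0.0, 0.5):
    A, s = scenario(5, eps, sigma, s1, 0.9, zeta)
    r = solve_r(A, s, alpha)
    # r_1, its leading-order prediction, and r_rest in units of eps
    print(zeta, r[0], (1 - alpha) * s1 + alpha * zeta, r[1] / eps)
```

With $\zeta = 0$ the reputation of user 1 stays close to $(1-\alpha)s_1$, whereas a nonzero diagonal shifts it by roughly $\alpha\zeta$ regardless of what the other users think.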



4 Formal properties 

The implicit function (5) can be shown to have a number of desirable properties. In particular, for any choice of $\alpha$, $s$, $A$ allowed by Definitions 1 and 2 there always exists a well defined, unique solution $r \in [0,1]^n$. This result is fundamental in collaborative systems in which parties rely on the reputation values to make a decision.

In this section, we first introduce some notation and list a number of useful lemmas. We discuss the trivial solutions for $\alpha = 0$ and $\alpha = 1$. Then, we present a proof of existence and uniqueness of the solution $r$ for the general case $0 < \alpha < 1$. Finally, we compute the derivative of $r$ with respect to $A$. This provides a way to study the sensitivity of the reputation metric to malicious changes in the indirect evidence matrix (Section 6).



4.1 Notation and lemmas 

For a vector or a matrix, the notation '$V \geq 0$' means that all the entries are nonnegative. For other notation we refer to Table 1.

Lemma 1 If $r$ is a solution of (5) satisfying $r \geq 0$, then $r \in [0,1]^n$.

The proof is given in the Appendix.

Lemma 2 For given $\alpha$, $s$, $A$ and a given $\ell \in [0, n]$, such that $\det(\ell\mathbf{1} - \alpha A) \neq 0$, there can exist at most one vector $r \in \mathbb{R}^n$ that satisfies (5) and $e^{T}r = \ell$.



Proof: Let $\ell = e^{T}r$. Eq. (5) can be rewritten as

$$
r = u(\ell) := (1-\alpha)\left[\mathbf{1} - \frac{\alpha}{\ell}A\right]^{-1} s.
\tag{9}
$$

This fixes the vector $r$ uniquely as a function of the scalar $\ell$. □

Given a solution $r$, Lemma 2 tells us that a nontrivial permutation of $r$ cannot be a solution.

Lemma 3 (Theorem 1.7.3 in Ref. [5]) Let $M \geq 0$ be a square matrix. Then $M$ has a positive eigenvalue $\lambda_{\max}$ which is equal to the spectral radius. There is an eigenvector $v_{\max} \geq 0$ associated with $\lambda_{\max}$. For $x > \lambda_{\max}$ it holds that $(x\mathbf{1} - M)^{-1} \geq 0$.

4.2 The special cases $\alpha = 0$ and $\alpha = 1$

The case $\alpha = 0$ trivially yields $r = s$. The case $\alpha = 1$ is more interesting. Eq. (5) reduces to

$$
A r = (e^{T}r)\, r.
\tag{10}
$$

This has the form of an eigenvalue equation. The matrix $A$ has eigenvectors $v_i$ and eigenvalues $\lambda_i$. There exist $n$ solutions of (10), namely

$$
r^{(i)} = \lambda_i \frac{v_i}{e^{T}v_i},
\tag{11}
$$

i.e. proportional to the eigenvectors of $A$. However, the Perron-Frobenius theorem for nonnegative irreducible matrices (see e.g. Ref. [6], Chapter 2) tells us that only one of the eigenvectors gives an acceptable reputation vector: $v_{\max} > 0$. All the other eigenvectors have at least one negative entry. We are left with a single solution,

$$
\text{At } \alpha = 1: \quad r = \lambda_{\max} \frac{v_{\max}}{e^{T}v_{\max}} \quad \text{and} \quad \ell = \lambda_{\max}.
\tag{12}
$$
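For a concrete feel of the $\alpha = 1$ solution (12), the sketch below approximates $\lambda_{\max}$ and $v_{\max}$ by power iteration and then forms $r = \lambda_{\max} v_{\max}/(e^{T}v_{\max})$. Power iteration is our choice for illustration and is not prescribed by the text; the $3 \times 3$ matrix is a made-up example and the function name `perron_reputation` is hypothetical. Convergence of power iteration assumes that $A$ is primitive, which holds for the example because all off-diagonal entries are positive.

```python
def perron_reputation(A, iters=1000):
    """Approximate r = lambda_max * v_max / (e^T v_max) by power iteration."""
    n = len(A)
    v = [1.0 / n] * n                  # positive start vector with e^T v = 1
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[x][y] * v[y] for y in range(n)) for x in range(n)]
        lam = sum(w)                   # e^T A v; tends to lambda_max because e^T v = 1
        v = [wi / lam for wi in w]     # renormalise so that e^T v = 1 again
    return [lam * vi for vi in v], lam


# Made-up example matrix (zero diagonal, entries in [0, 1]).
A = [[0.0, 0.9, 0.8],
     [0.2, 0.0, 0.3],
     [0.6, 0.5, 0.0]]
r, lam = perron_reputation(A)
# Residual of Eq. (10), A r = (e^T r) r; it should be close to zero.
residual = max(abs(sum(A[x][y] * r[y] for y in range(3)) - sum(r) * r[x])
               for x in range(3))
print(r, lam, residual)
```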

4.3 The general case $0 < \alpha < 1$; Main theorems

Multiplying (9) from the left with $e^{T}$ and then multiplying by a suitable constant gives

$$
f(\ell) = 1 \quad \text{where} \quad f(\ell) := (1-\alpha)\, e^{T}(\ell\mathbf{1} - \alpha A)^{-1} s.
\tag{13}
$$

This equation helps us to prove several important properties of our metric. First, we demonstrate that (5) always has a well defined, unique solution in the general case $0 < \alpha < 1$.

Theorem 1 For $\alpha$, $A$, $s$ as given in Definitions 1 and 2 there exists a reputation vector $r \in [0,1]^n$ satisfying (5). The solution is of the form $r = u(\ell_*)$ with $u$ the function defined in (9) and $\ell_* \in (\alpha\lambda_{\max}, n]$.

Corollary 1 In the limits $\alpha \to 0$ and $\alpha \to 1$, (13) and (9) correctly reproduce the reputation vector for the special cases $\alpha = 0$ and $\alpha = 1$.

Theorem 2 The solution in Theorem 1 is the only solution of (5) satisfying $r \in [0,1]^n$.

The proofs of Theorems 1 and 2 and Corollary 1 are given in the Appendix.

The quality of a reputation system is determined by how accurately the computed reputation predicts the 
future performance of entities even when attackers attempt to manipulate reputation values. The following 
result allows us to study the effect of unfair ratings by analyzing the sensitivity of reputation values to 
changes in the indirect evidence matrix. 



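Theorem 3 For fixed $\alpha$ and $s$, a small change in $A$ affects $r$ as follows:

$$
\frac{\partial r_x}{\partial A_{zy}} = \alpha\, r_y \left[\ell\mathbf{1} - \alpha A + \frac{\alpha}{\ell} A r e^{T}\right]^{-1}_{xz}.
\tag{14}
$$

(Here $[\cdots]^{-1}_{xz}$ stands for element $xz$ of the inverse matrix.)

The proof of Theorem 3 is given in the Appendix.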

Theorem 3 gives some direct insight into the effectiveness of attacks. First, we see that the effect of the attack is proportional to $\alpha$. Furthermore, if some user $y$ wants to attack the reputation of user $x$, the most obvious attack is to reduce the matrix element $A_{xy}$, i.e. $(\delta A)_{xy} < 0$. We see in (14) that the effect is proportional to $r_y$. Hence, the effectiveness of his attack is proportional to his own reputation. (Of course this does not come as a surprise, but it is good to see intuition getting confirmed.) From this we see that it is advantageous for him to improve his own reputation before attacking other users' reputations.

Finally, from (14) we can also read off a less obvious attack strategy. The attacker $y$ may also indirectly attack $x$ by manipulating $A_{zy}$, where $z$ is some other user. The effect of this attack is proportional to the matrix element $E_{xz} := [\ell\mathbf{1} - \alpha A + \frac{\alpha}{\ell} A r e^{T}]^{-1}_{xz}$. In practice, user $y$'s attack on $x$ could look as follows. He computes $E_{xz}$ for all $z$, $z \neq y$. He picks a number of users $z$ whose $E_{xz}$ have the highest magnitude. For each of them, if $E_{xz} < 0$, he causes a positive change in $A_{zy}$, otherwise a negative change. Remark: This reasoning applies for small changes $\delta A$. In the numerical experiments (Section 6.5) we take a worst case approach and allow the attacker to make big changes in $A$.



5 Computing reputation 

From the structure of Lemma 2 and the proof of Theorem 1 we can derive a direct method (Fig. 2) for computing $r$ from $\alpha$, $s$, and $A$. This algorithm first solves (13) for $\ell$, obtaining a solution $\ell_* > \alpha\lambda_{\max}$ (lines 1-3). The equation $f(\ell) = 1$ is a polynomial equation of degree $n$; this becomes evident if we write $A$ as $A = Q\Lambda Q^{-1}$ (with $\Lambda$ the diagonal matrix containing the eigenvalues of $A$, and $Q$ the matrix whose columns are the eigenvectors $v_i$) and multiply (13) by $\det(\ell\mathbf{1} - \alpha A)$:

$$
\prod_{i=1}^{n} (\ell - \alpha\lambda_i) = (1-\alpha) \sum_{i=1}^{n} (e^{T}Q)_i\, (Q^{-1}s)_i \prod_{j \in [n]\setminus\{i\}} (\ell - \alpha\lambda_j).
\tag{15}
$$

The highest order on the left hand side is $\ell^{n}$, and on the right $\ell^{n-1}$. The algorithm first completely solves the eigensystem of $A$ (lines 1-2) and then solves (15), looking only for the unique solution $\ell_* > \alpha\lambda_{\max}$ (line 3). Finally, it substitutes that value into (9) (line 4). Theorem 1 guarantees that the outcome is a vector in $[0,1]^n$.



1  $\{\lambda_i\}$ = Eigenvalues($A$)
2  $Q$ = Eigenvectors($A$)
3  Find $\ell_* > \alpha\lambda_{\max}$ that solves (15)
4  $r = (1-\alpha)\left[\mathbf{1} - \frac{\alpha}{\ell_*}A\right]^{-1} s$

Figure 2: Direct method
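A rough Python sketch of the direct method is given below. It deviates from Fig. 2 in one respect, which we state plainly: instead of diagonalising $A$ and solving the polynomial equation (15), it finds the root $\ell_*$ of $f(\ell) = 1$ from (13) by bisection on $(\alpha\lambda_{\max}, n]$, using the fact (from the proof of Theorem 1) that $f$ is decreasing on that interval. $\lambda_{\max}$ is estimated by power iteration, which assumes a primitive $A$. The helper names `solve`, `lambda_max` and `direct_reputation` and the example data are our own.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def lambda_max(A, iters=2000):
    """Largest eigenvalue of a nonnegative primitive matrix by power iteration."""
    n = len(A)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[x][y] * v[y] for y in range(n)) for x in range(n)]
        lam = sum(w)
        v = [wi / lam for wi in w]
    return lam


def direct_reputation(A, s, alpha, steps=200):
    """Find ell_* with f(ell_*) = 1 by bisection, then evaluate r = u(ell_*) from (9)."""
    n = len(s)

    def f(ell):
        M = [[(ell if x == y else 0.0) - alpha * A[x][y] for y in range(n)]
             for x in range(n)]
        return (1 - alpha) * sum(solve(M, s))

    lo, hi = alpha * lambda_max(A), float(n)   # f > 1 near lo, f(n) <= 1
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    ell = 0.5 * (lo + hi)
    M = [[(1.0 if x == y else 0.0) - alpha * A[x][y] / ell for y in range(n)]
         for x in range(n)]
    return [(1 - alpha) * ui for ui in solve(M, s)], ell


A = [[0.0, 0.9, 0.8], [0.2, 0.0, 0.3], [0.6, 0.5, 0.0]]   # made-up example
r, ell = direct_reputation(A, [0.5, 0.5, 0.5], alpha=0.5)
print(r, ell)
```

Bisection needs only linear solves, at the price of many evaluations of $f$; an eigendecomposition as in Fig. 2 would be the closer match to the text when a numerical library is available.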



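1  $r^{(0)} = s$
2  repeat
3      $\tilde r^{(k+1)} = A r^{(k)} / \ell^{(k)}$, with $\ell^{(k)} = e^{T} r^{(k)}$
4      $r^{(k+1)} = (1-\alpha)s + \alpha\, \tilde r^{(k+1)}$
5      diff $= \| r^{(k+1)} - r^{(k)} \|_1$
6  until diff $< \delta$

Figure 3: Iterative method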



An iterative method for solving (5) is presented in Fig. 3. This algorithm first computes reputation as the weighted average of reputation values in $A$ (line 3). Then, it calculates the average over direct and indirect evidence using $\alpha$, $1 - \alpha$ as weights (line 4). The algorithm repeatedly computes the reputation vector until it converges, that is, the difference between the new state $r^{(k+1)}$ and the previous one $r^{(k)}$ is less than a certain threshold (lines 5-6). Notice that the termination condition corresponds to

$$
\left\| (1-\alpha)s + \alpha \frac{A r^{(k)}}{e^{T} r^{(k)}} - r^{(k)} \right\|_1 < \delta.
\tag{16}
$$

In Section 6.2 we show numerically that the two algorithms find the same solution.
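The iterative method of Fig. 3 translates almost line by line into code. The following Python sketch is illustrative and is not the authors' Mathematica implementation; the function name, the default threshold and the iteration cap `max_iter` are assumptions added here, and the sketch requires $s \neq 0$ so that the first normalisation is well defined.

```python
def iterative_reputation(A, s, alpha, delta=1e-12, max_iter=10000):
    """Iterate r <- (1 - alpha) s + alpha A r / (e^T r) until the 1-norm change is below delta."""
    n = len(s)
    r = list(s)                                    # line 1: r^(0) = s
    for k in range(max_iter):                      # line 2: repeat
        ell = sum(r)                               # normalisation e^T r^(k)
        avg = [sum(A[x][y] * r[y] for y in range(n)) / ell
               for x in range(n)]                  # line 3: weighted average of the ratings
        new = [(1 - alpha) * s[x] + alpha * avg[x]
               for x in range(n)]                  # line 4: mix s with the averaged ratings
        diff = sum(abs(p - q) for p, q in zip(new, r))   # line 5: 1-norm of the change
        r = new
        if diff < delta:                           # line 6: until diff < delta
            return r, k + 1
    return r, max_iter                             # cap reached (assumption: caller checks)


# Made-up example: 3 users, uniform pre-trust, alpha = 0.5.
A = [[0.0, 0.9, 0.8], [0.2, 0.0, 0.3], [0.6, 0.5, 0.0]]
r, iterations = iterative_reputation(A, [0.5, 0.5, 0.5], alpha=0.5)
print(r, iterations)
```

Each pass costs one matrix-vector product, so the total work is of order $n^2$ times the number of iterations.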



6 Numerical experiments 

In this section, we assess the performance of our metric for different choices of the parameters $\alpha$ and $s$ and discuss how these parameters can be used to mitigate the impact of attacks on the reputation system. In particular, we first discuss our choice for the $A$-matrix. We compare the performance of the algorithms for computing the reputation vector $r$ presented in Section 5. Then, we investigate the effect of $\alpha$ and $s$ as well as the effect of self-ratings on $r$. Finally, we discuss attacks and their effectiveness.

6.1 Generation of the matrix A 

To study our metric, we simulate a characteristic marketplace scenario. Our scenario consists of a number 
of users who can interact with each other and rate the party with whom they interact after a transaction. 
In our experiments, we also investigate the robustness of the metric against different threat models which describe typical attacker behavior. Threat models will be described in Section 6.5.



In order to simulate a realistic scenario, we generated random A-matrices as follows. 

1. All non-diagonal elements of $A$ are initialized to $1/2$. This is the 'neutral' value for users who have not yet interacted with each other.

2. For each user $i$, a value $\tau_i \in [0,1]$ is drawn from a triangular probability distribution $\sigma(\tau)$ that has $\sigma(0) = 0$, $\sigma(1) = 0$, and a peak $\sigma(\tau_{\max}) = 2$. The number $\tau_i$ serves as the 'intrinsic' trustworthiness of user $i$. We have a group of experiments with varying $\tau_{\max}$ to show its effect and, otherwise, $\tau_{\max}$ is set to 0.6 as the representative value.

3. We fix a number $f \in (0,1)$, the 'filling fraction'. We randomly generate $f(n^2 - n)$ user pairs $(x_a, y_a)$, with $x_a \neq y_a$. These pairs represent past interactions between the selected users, where $y_a$ judged $x_a$. We set $f = 0.3$.

4. For each of the pairs $(x, y)$ the matrix element $A_{xy}$ is assigned a random value uniformly drawn from the interval $[\max\{\tau_x - 0.1, 0\}, \min\{\tau_x + 0.1, 1\}]$. This step simulates the fact that the judgment of $x$ by $y$ is mostly determined by the intrinsic trustworthiness $\tau_x$, while allowing for some noise.

We consider this set-up acceptably realistic for the following reasons. First, for large $n$ it is unlikely that every user has interactions with everybody else. Only a fraction $f < 1$ of the matrix gets 'filled'. Second, the direct opinion about a user is the result of interactions with him. Someone's opinion about $x$ depends mainly on the behavior of $x$ (whose intrinsic trustworthiness is modeled as $\tau_x$), and also on other circumstances, which we model as small-amplitude random noise. Our choice of a triangle-shaped probability distribution for $\tau$ is motivated by the wish to keep the model as simple as possible while still containing the necessary ingredients.
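The four generation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the pairs of step 3 are drawn without replacement (the text does not say whether repeats are allowed), the diagonal is set to 0 as in Section 3, and the function name `generate_A`, the `noise` parameter and the seed handling are ours.

```python
import random


def generate_A(n, tau_max=0.6, f=0.3, noise=0.1, seed=0):
    """Random A following steps 1-4; returns A, the intrinsic values tau, and the filled pairs."""
    rng = random.Random(seed)
    # Step 1: neutral value 1/2 off the diagonal, 0 on the diagonal.
    A = [[0.0 if x == y else 0.5 for y in range(n)] for x in range(n)]
    # Step 2: intrinsic trustworthiness from a triangular density on [0, 1] peaked at tau_max.
    tau = [rng.triangular(0.0, 1.0, tau_max) for _ in range(n)]
    # Step 3: choose f * (n^2 - n) distinct ordered pairs (x, y), x != y, where y judged x.
    pairs = [(x, y) for x in range(n) for y in range(n) if x != y]
    chosen = rng.sample(pairs, round(f * (n * n - n)))
    # Step 4: the rating of x by y is uniform in [tau_x - noise, tau_x + noise], clipped to [0, 1].
    for x, y in chosen:
        A[x][y] = rng.uniform(max(tau[x] - noise, 0.0), min(tau[x] + noise, 1.0))
    return A, tau, chosen


A, tau, chosen = generate_A(20)
print(len(chosen), "entries filled out of", 20 * 19)
```

A triangular density on $[0,1]$ has peak height 2 wherever the mode is placed, which matches the condition $\sigma(\tau_{\max}) = 2$ in step 2.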

6.2 Comparison of computation methods 

We implemented the algorithms presented in Figs. 2 and 3 in Wolfram Mathematica 7.0. It turns out that the iterative method (Fig. 3) is faster than the direct method (Fig. 2) at the same level of accuracy. This is hardly surprising, since the heaviest operations in the iterative method are the repeated matrix-times-vector multiplications (order $n^2$ times the number of iterations), while the direct method involves solving the whole eigensystem of an $n \times n$ matrix. We did a number of experiments where we solved $r$ with the direct method, using Mathematica's default machine precision. This gave

$$
\left| r_x - \left[ (1-\alpha)s_x + \alpha \frac{(Ar)_x}{e^{T}r} \right] \right| < 10^{-15}.
\tag{17}
$$

We did the same experiments (same $A$, $\alpha$, $s$) with Alg. 2, with $\delta = n \cdot 10^{-15}$. This $\delta$ is tailored to yield the same accuracy as (17), as can be seen by comparing (17) to (16).

Figure 4: Logarithmic plot of the difference $\|r_1 - r_2\|_1$ between the two algorithms, averaged over 20 experiments, as a function of $\alpha$. From bottom to top $n = 50, 100, 200$.

For $n$ we took the values 50, 100, and 200. The number of required iterations is then typically 12 or less, and decreases with growing $n$. For every $n$ and $\alpha$ we took 20 different $A$-matrices. Fig. 4 shows the distance $\|r_1 - r_2\|_1$ averaged over these 20 experiments, where $r_1$, $r_2$ are the solutions found by the direct and iterative method respectively. Clearly $r_1$ and $r_2$ are almost identical.

The results presented hereafter were obtained using the iterative method.

6.3 Impact of the parameters 

The objective of this set of experiments is to evaluate the impact of the parameter $\alpha$ and the initial reputation vector $s$ on the reputation vector $r$.

The parameter $\alpha$. Figs. 5(a) and 5(b) show $\ell$ and selected components of $r$ as a function of $\alpha$. The linearity in these graphs is surprising, since we know from (5) that $r$ is not strictly linear in $\alpha$. (Close inspection of Fig. 5(b) indeed shows that the data do not precisely lie on straight lines.) Yet $r$ is quite close to a linear interpolation between the $\alpha = 0$ and $\alpha = 1$ solutions,

$$
r \approx (1-\alpha)s + \alpha\, \lambda_{\max} \frac{v_{\max}}{e^{T}v_{\max}}.
\tag{18}
$$

This result is independent of the choice of $\tau_{\max}$, as shown in Fig. 5(a). As expected, $\tau_{\max}$ has an impact on the average reputation of peers within the system ($\ell/n$). Fig. 5(b) demonstrates how a pre-trusted user (a user who has initial reputation equal to 1) can lose his leading position when $\alpha$ increases, as gradually more weight is given to $A$ than to $s$.

As we discussed earlier, $\alpha$ serves as weight for direct versus indirect information. Accordingly, the system owner should choose the value of $\alpha$ on the basis of his confidence in the information he initially has. Suppose, for instance, that he is confident that a user $x$ is trustworthy, but his reputation $r_x$ turns out below average. This may arouse suspicion that some malicious user is attempting to subvert the system. In this case, he should select a low value of $\alpha$ to reduce the influence of the information provided by users on the computation of reputation values. At the same time, setting $\alpha$ to 0 would make it impossible to capture the dynamics of the actual user behavior. The study of the behavior of the components of $r$ (Fig. 5(b)) can assist the system owner to select $\alpha$ in such a way that $x$ keeps his high ranking, while information provided by users is still taken into account.

Figure 5: Dependence of the reputation on $\alpha$. $n = 500$ and $s = (1, 0, \ldots, 0)^{T}$. (a) For every $\alpha$, 50 random $A$ were taken, and only the min., max. and average $\ell/n$ are plotted. (b) One fixed random $A$. The downward curve is $r_1$. The other plotted components are those with the minimum, maximum and median reputation at $\alpha = 1$, plus two more in between.

Figure 6: Dependence of the reputation on $s$. $n = 1000$, $\alpha = 0.9$, one (typical) fixed random $A$, $s = ce$. (a) $\tau_{\max} = 0.2$; (b) $\tau_{\max} = 0.6$; (c) $\tau_{\max} = 0.9$.



The parameter $s$. We studied the dependence on $s$ in two ways: (i) We set $s = ce$, with $c \in [0,1]$, i.e., the initial reputation given by the system owner is equally distributed among all users (Fig. 6). We repeated the experiments for different $\tau_{\max}$ values. (ii) We set $s = (1, 1, \ldots, 1, 0, \ldots, 0)^{T}$, varying the number $T$ of 1s in the vector (i.e., the number of pre-trusted users). Fig. 7 shows several components of $r$ for these two cases. For $s = ce$, the linear behavior of $r$ as a function of $c$ is hardly surprising, in view of the approximation (18), which is linear in $s$. Fig. 6 also shows that the average reputation of peers within the network increases with the increase of $\tau_{\max}$. Case (ii) shows jumps as a function of $T$, and even the ranking changes occasionally. This can also be understood from (18). When an extra user $x$ is included in the pre-trusted set, the main effect is a jump in $r_x$ of size $\approx 1 - \alpha$, with only minor changes to the other reputations. This result demonstrates that the effect of selecting pre-trusted users wrongly can be mitigated by increasing $\alpha$.

In summary, the starting vector $s$ has a clear effect on the reputations $r$, which is well described by the linear approximation (18). In particular, $s$ makes $r$ less sensitive to changes in $A$. The pre-trust that the central authority puts in users is carried over (multiplied by a factor $1 - \alpha$) into the reputation vector $r$.

Some guidelines for choosing $s$ are given in Section 6.5.



6.4 Effect of self-references 

As discussed in Section 3 we set $A$'s diagonal to 0 to minimize the effect of self-references. In this section we study what happens when the diagonal is set to 1 (the strongest possible departure from $A_{xx} = 0$). In our experiments, we create an $A$ matrix for different $n$ (from 10 to 500) and calculate reputation when the values on the diagonal are 0 (obtaining $r_0$, with norm $\ell_0 = e^{T}r_0$) and when they are 1 (obtaining $r_1$, with norm $\ell_1 = e^{T}r_1$). We use the relative change $\Delta\ell/\ell_0 = (\ell_1 - \ell_0)/\ell_0$ as a measure of the influence of self-references. We performed 20 experiments for each $n$ and for $\alpha = 0.1, 0.5, 0.9$; for each set of experiments, we determined the average change.

Figure 7: Dependence of the reputation on $s$. $n = 1000$, $\alpha = 0.9$, one (typical) fixed random $A$, $s = (1, \ldots, 1, 0, \ldots, 0)$, with $T$ pre-trusted users.

Figure 8: Effect of self-rating. The average change $\Delta\ell/\ell_0$ as a function of $n$ (for various values of $\alpha$) plotted on a log-log scale. The slope $-1$ indicates that $\Delta\ell/\ell_0 \propto n^{-1}$ for large $n$.

Fig. 8 shows the average percentage change in $\ell$ as a function of $n$. We observe that the magnitude of the change is inversely proportional to $n$, and hence becomes negligible for large $n$. The effect of self-references is non-negligible at small $n$. For $n$ between 10 and 200, it varies between 20% and 1%. Furthermore, we observe that $r_1 = (1 + \alpha/\ell_0)\, r_0$. This proportionality is explained in the Appendix. As a consequence, $\ell_1 - \ell_0 = \alpha$. Notice that this result is constant, independent of $n$. In contrast, $\ell_0$ depends on $n$ linearly in most cases. (For specially crafted $A$ and $s$, such as the scenario in Section 3.2, $\ell_0$ may be independent of $n$.) This explains the inverse-$n$ proportionality of the percentage change $\Delta\ell/\ell_0$.
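The proportionality $r_1 = (1 + \alpha/\ell_0)\, r_0$ is easy to confirm on a small example. The sketch below is our own illustration, not the authors' experiment: it solves (5) by fixed-point iteration for a made-up $3 \times 3$ matrix with the diagonal set to 0 and to 1, and prints $\ell_1 - \ell_0$ and the component-wise ratios. The function name `fixed_point` and all numerical values are assumptions.

```python
def fixed_point(A, s, alpha, delta=1e-14, max_iter=100000):
    """Solve r = (1 - alpha) s + alpha A r / (e^T r) by repeated substitution."""
    n = len(s)
    r = list(s)
    for _ in range(max_iter):
        ell = sum(r)
        new = [(1 - alpha) * s[x]
               + alpha * sum(A[x][y] * r[y] for y in range(n)) / ell
               for x in range(n)]
        diff = sum(abs(p - q) for p, q in zip(new, r))
        r = new
        if diff < delta:
            break
    return r


A0 = [[0.0, 0.9, 0.8],
      [0.2, 0.0, 0.3],
      [0.6, 0.5, 0.0]]                                    # diagonal 0
A1 = [[1.0 if x == y else A0[x][y] for y in range(3)]
      for x in range(3)]                                  # same ratings, diagonal 1
s, alpha = [0.5, 0.2, 0.8], 0.5
r0 = fixed_point(A0, s, alpha)
r1 = fixed_point(A1, s, alpha)
ell0, ell1 = sum(r0), sum(r1)
print(ell1 - ell0)                          # to be compared with alpha
print([b / a for a, b in zip(r0, r1)])      # to be compared with 1 + alpha / ell0
```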

6.5 Robustness Against Attacks 

In our model, we assume that malicious users can compromise the integrity of $r$ only by manipulating $A$ (i.e., by providing unfair ratings). They can influence neither $\alpha$ nor $s$. Attacks on the computation process and dissemination are out of scope in this paper. Whitewashing attacks are not critical, since new users get neutral entries in $A$. We assume that $A$ is publicly known. How strongly $A$ can be manipulated depends on the actual feedback aggregation method. This can be a slow and/or costly process, e.g. if it involves feedback on transactions. In Section 4.3 we described the effect of a small change in $A$ and discussed how an attacker can exploit such changes to affect the computation of reputation. Here, we present some threat models that are inspired by such malicious behavior:

Self-promotion: The attacker's goal is to improve his own reputation. He can do this by giving (i) positive feedback to users who have given him positive ratings, and (ii) negative feedback to users who have given him negative ratings.

Slandering: The goal is to ruin the reputation of a target $x$. The options are giving (i) negative feedback to $x$, (ii) positive feedback to users who have given negative ratings to $x$, and (iii) negative feedback to users who have given positive ratings to $x$.

Sybil attacks: An attacker creates new accounts. These give positive feedback to him and to each other in order to improve their reputation and ability to influence $r$. Then, a slandering attack can be launched with the help of the new accounts.

Let $t$ be the number of attackers. Without loss of generality we can group the attackers together and let them lie in $\{n - t + 1, \ldots, n\}$. Then, $A$ is of the form

$$
A = \begin{pmatrix}
\text{honest judged by honest} & \text{honest judged by attackers} \\
\text{attackers judged by honest} & \text{attackers judged by attackers}
\end{pmatrix}.
\tag{19}
$$

Only the right hand part of the matrix can be influenced by the attackers. A Sybil attack enlarges $A$ by adding rows at the bottom and columns at the right.

Self-promotion experiments. The objective of this set of experiments is to evaluate the effectiveness and robustness of the reputation metric against self-promotion attacks. Consider one attacker $y$. He modifies $A_{xy}$ to 1 if $A_{yx} > 0.5$ and to 0 if $A_{yx} < 0.5$. (He tries to boost the reputation of those that have a high opinion of him, and to reduce the reputation of the rest.) The effect $\Delta r_y$ of such an attack is shown in Fig. 9. The plotted data are calculated for one random $A$, but we have performed many such experiments; the presented results are typical for the whole ensemble of random matrices. We chose an attacker $y$ with $r_y < 0.5$, i.e. a user with less than neutral reputation who actually needs the attack.¹ Clearly the attacker has little effect; his opinion is only one of many. As expected, $\Delta r_y$ grows with $\alpha$, since any change in $A$ gets weight $\alpha$ in the computation of the reputation. We also see that the choice of $s$ has a nontrivial impact. In particular, the larger the (total) reputation that the system owner initially gives to users, the smaller the effect of the attack. Both Figs. 9(a) and 9(b) show a nonlinear dependence of $\Delta r_y$ on the components of $s$. Note that the attack strength for $c \to 1$ is not the same as for $T \to n - 1$. In particular, $s_y$ is not the same in these cases: in the first case the attacker has initial reputation $s_y = c$, whereas in the second case $s_y = 0$. This turns out to have a noticeable effect.

In summary, the most effective countermeasure for mitigating self-promotion attacks is to decrease $\alpha$. Another strategy would be to enlarge the set of pre-trusted users. It is worth noting that this result contradicts the suggestion given in [23] to choose a very small number of pre-trusted users. However, if the attacker is included in the set of pre-trusted users, the power of the countermeasure is reduced.

Figure 9: Effect of self-promotion attacks, $n = 200$, single fixed $A$. The difference in the attacker's reputation is shown (a) as a function of $c$, for $s = ce$; (b) for $s = (1, \ldots, 1, 0, \ldots, 0)$ as a function of the number $T$ of pre-trusted users. (The attacker is not one of the pre-trusted users.)

¹ An attacker with $r_y > 0.5$ has a bit more effect (14), unless $r_y$ is close to 1, where no more improvement is possible.

Figure 10: Effect of the slandering attack, $n = 100$. The difference $\Delta r_x$ in the target's reputation is shown (a) as a function of $c$, for $s = ce$; (b) for $s = (1, \ldots, 1, 0, \ldots, 0)$ as a function of the number $T$ of pre-trusted users. (The attacker and target are not part of the pre-trusted users.)



Slandering experiments. Again consider one attacker $y$. His target is $x$. He sets $A_{xy} = 0$, and also makes the following modifications: for $z \notin \{x, y\}$ he sets $A_{zy} = 1$ if $A_{xz} < 0.5$ and $A_{zy} = 0$ otherwise. (He tries to boost the reputation of those who have a bad opinion about $x$, and to reduce the reputation of the rest.) The effect is shown in Fig. 10, with the dependence on $s$ presented in the same way as in Fig. 9. We observe that the effect of this attack is roughly ten times stronger than in the self-promotion attack. The difference lies in the fact that the attacker $y$ can directly manipulate $A_{xy}$ in the slandering attack, while there is no such possibility in the self-promotion attack ($A_{yy}$ is fixed). We studied the magnitude of the direct and indirect components of the slandering attack separately. The results (not reported here due to the lack of space) show that the direct attack is stronger than the indirect one ($\approx$ ten times).
It is worth noting that the curves in Fig. 10(a) are almost flat, i.e. for $s = ce$ the effect of the direct attack on $r_x$ via $A_{xy}$ is almost independent of $c$. We suspect (but cannot yet substantiate) that the $c$ largely disappears due to the normalization that is inherently present in the definition of the metric (and which is most clearly visible in step 3 of Algorithm 2) in combination with the fact that all users, including the attacker, are pre-trusted. In contrast, the curves in Fig. 10(b) are comparable to the ones for the self-promotion attack when $y$ is not pre-trusted (i.e., $s_y = 0$).
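As a toy illustration of the slandering attack described above, the following Python sketch applies the attacker's modifications to a small hand-made matrix and compares the target's reputation before and after. Everything numerical here ($n = 4$, the ratings, $s$, $\alpha = 0.5$) is an assumption chosen for readability and is unrelated to the experiments of Fig. 10; the function names `reputation` and `slander` are hypothetical.

```python
def reputation(A, s, alpha, delta=1e-14, max_iter=100000):
    """Solve r = (1 - alpha) s + alpha A r / (e^T r) by fixed-point iteration."""
    n = len(s)
    r = list(s)
    for _ in range(max_iter):
        ell = sum(r)
        new = [(1 - alpha) * s[x]
               + alpha * sum(A[x][y] * r[y] for y in range(n)) / ell
               for x in range(n)]
        diff = sum(abs(p - q) for p, q in zip(new, r))
        r = new
        if diff < delta:
            break
    return r


def slander(A, x, y):
    """Copy of A in which attacker y slanders target x (A[x][y] is y's rating of x)."""
    B = [row[:] for row in A]
    B[x][y] = 0.0                                     # direct attack on the target
    for z in range(len(A)):
        if z not in (x, y):
            # boost users with a bad opinion of x, punish those with a good one
            B[z][y] = 1.0 if A[x][z] < 0.5 else 0.0
    return B


# Target is user 0, attacker is user 3; user 1 rates the target high, user 2 low.
A = [[0.0, 0.9, 0.2, 0.8],
     [0.5, 0.0, 0.5, 0.5],
     [0.5, 0.5, 0.0, 0.5],
     [0.5, 0.5, 0.5, 0.0]]
s, alpha = [0.5] * 4, 0.5
before = reputation(A, s, alpha)
after = reputation(slander(A, x=0, y=3), s, alpha)
print(before[0], after[0])
```

Only column $y$ of the matrix changes, in line with the observation that an attacker controls nothing but his own ratings.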

In summary, the countermeasures for mitigating slandering attacks are similar to the ones for self-promotion attacks. However, unlike in the self-promotion attack, the inclusion of the attacker in the set of pre-trusted users would make the countermeasure completely ineffective, as shown in Fig. 10(a).

Figure 11: Effect of the Sybil attack. The percentage of the target's reputation reduction is shown as a function of $m$. In (a) we fixed $\alpha = 0.9$ and varied the number of pre-trusted users $T$. From bottom to top $T = 10, 50, 100$. Before the attack, the target's reputation is 0.56 for all these values of $T$. In (b) we fixed $T = 50$ and varied $\alpha$. (The attacker and sybils are not part of the pre-trusted users; the target is pre-trusted.) Before the attack, the target's reputation is 0.89, 0.74 and 0.53 for $\alpha = 0.2, 0.5, 0.9$ respectively.



Sybil attack experiments. We consider one attacker $y$ who creates $n \cdot m$ extra accounts ('siblings') $n+1, \ldots, n+nm$. His main aim is to decrease $r_x$ for some fixed target $x$. To this end, all the siblings give negative ratings to the target and positive ratings to each other and to $y$. This corresponds to setting $A_{x\sigma} = 0$, $A_{\sigma\sigma'} = 1$, $A_{y\sigma} = 1$, where $\sigma, \sigma' > n$ and $\sigma' \neq \sigma$. Furthermore, they give positive ratings to those users who have rated the target negatively and negative ratings to those users who have rated the target positively. In our model, this corresponds to setting $A_{z\sigma}$ to 1 for those $z \notin \{x, y\}$ that have $A_{xz} < 0.5$, and to 0 otherwise. We started with $n = 200$ and in each experiment we increased the size of $A$ by adding pseudonyms to the set of users such that the pseudonyms make up between 0% and 120% of the original number of users. The effect of the Sybil attack is shown in Fig. 11. Clearly the attack is much more effective than the slandering attack in Fig. 10. As expected, the effect grows with the number of siblings. Notice that a very large number of siblings is required to significantly reduce $r_x$; at $m = 1$ (as many siblings as original users) still about 40% of the target's reputation remains.

In Fig. 11(a) we can see that increasing the number of pre-trusted users helps to improve the robustness against Sybil attacks; however, the choice of the starting vector $s$ has little effect on the attack. Fig. 11(b) shows that $\alpha$ has a nontrivial impact. Indeed, with small values of $\alpha$ we give less weight to $A$ and, as a consequence, the attack is less strong too. Finally, both figures show that the effect per added sibling is strongest for $m$ smaller than approximately 0.6, and for larger $m$ saturation sets in; more and more siblings have to be added to obtain a significant effect.

7 Conclusions 

We have presented a flow-based reputation metric for aggregated feedback. The metric gives absolute reputation values instead of merely a ranking; it also makes use of all the relevant information without discarding any part of it, leading to reputation values with better discriminating capabilities. We have given a proof that there is always a solution and that it is unique. We have also compared different methods for computing the reputation vector, and studied the properties of the metric numerically, focusing in particular on how attackers can manipulate reputation values. We have analyzed the impact of the initial reputation vector $s$ and the weight parameter $\alpha$ on $r$. It turns out that the reputations depend on $\alpha$ in a surprisingly linear way, although the equations are nonlinear. They interpolate between the known solutions at $\alpha = 0$ and $\alpha = 1$, with small deviations from a straight line. The direct information plays an important role (also for the ranking) even when little weight is given to it.

We have also studied how these parameters can be used to make the reputation metric more robust against attacks. The attacks can be direct (attacker $y$ manipulates $A_{xy}$ for target $x$) as well as indirect (manipulating $A_{zy}$ for other users $z \neq x$). A Sybil attack increases the effectiveness. The most evident result is that the $\alpha$ parameter has a much stronger effect on the robustness than $s$. Robustness against attacks and in particular Sybil attacks is obtained by choosing a smaller $\alpha$. However, a balance must be kept between resisting attacks and making constructive use of the information provided by users in the $A$ matrix. In particular, $\alpha$ must not be chosen too small because there is a danger from choosing a wrong $s$. With a larger $\alpha$, the effect of choosing the wrong pre-trusted users is mitigated, as the choice of $s$ hardly matters. This is demonstrated by the jumps in Fig. 7, which have size $1 - \alpha$.

In this paper, we have mainly focused on the mathematical model of the reputation metric. In particular, we have studied its properties both analytically and numerically, which allows the specification of guidelines for making the system more robust against attacks. An interesting challenge for future research is to study whether those properties are preserved in the computation dimension, in particular, when there does not exist a centralized authority computing reputation values, but the computation is distributed across the users of the system.

Appendix

Proof of Lemma 1

Eq. (5) states that $r$ is the weighted average of two vectors; the first of these vectors is $s \in [0,1]^n$; the second vector is $g := Ar/\ell$. The weights are $1 - \alpha$ and $\alpha$ respectively. We have $g_a = \sum_b (r_b/\ell) A_{ab}$, i.e. $g$ is the weighted average of all the columns of $A$, with weights $r_b/\ell$; these weights are all nonnegative and add up to 1. Hence, since $A_{ij} \in [0,1]$, the vector $g$ satisfies $g_a \in [0,1]$ for all $a$. From the fact that both $s$ and $g$ have entries only in $[0,1]$, it follows that their weighted average has the same property. □

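Proof of Theorem 1

First we prove the existence of a solution. For $\ell > \alpha\lambda_{\max}$ Lemma 3 tells us that $(\mathbf{1}\ell/\alpha - A)^{-1} \geq 0$; hence $f(\ell) > 0$. Furthermore, $f(x)$ is a decreasing function of $x$ on this interval. Next we use $\alpha\lambda_{\max}/\ell < 1$ to express the matrix inverse as a convergent Taylor series,

$$
(\ell\mathbf{1} - \alpha A)^{-1} = \ell^{-1} \sum_{k=0}^{\infty} \left(\frac{\alpha}{\ell}\right)^{k} A^{k}.
\tag{20}
$$

Each term is nonnegative, $(A^{k})_{ij} \geq 0$. Next we use the bound $A \leq C$, where $C$ is the 'constant' matrix (all entries equal to 1). It has the special properties $C^{k} = n^{k-1}C$ (for $k \geq 1$) and $e^{T}C = n e^{T}$. This gives $A^{k} \leq n^{k-1}C$ and allows us to bound the inverse as follows,

$$
(\ell\mathbf{1} - \alpha A)^{-1} \leq \ell^{-1}\left[\mathbf{1} + C \sum_{k=1}^{\infty} \left(\frac{\alpha}{\ell}\right)^{k} n^{k-1}\right] = \ell^{-1}\left[\mathbf{1} + C\, \frac{\alpha/\ell}{1 - \alpha n/\ell}\right].
\tag{21}
$$

Using this bound, and $e^{T}s \leq n$ and $e^{T}Cs \leq n^{2}$, we can bound $f(n)$ as $f(n) \leq 1$.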
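For completeness, the last step can be spelled out. The following chain is our own expansion of the bound and uses only (13), (21) at $\ell = n$, and the two inequalities $e^{T}s \leq n$ and $e^{T}Cs \leq n^{2}$; it assumes $\alpha < 1$ so that the denominator $1 - \alpha$ is nonzero.

```latex
f(n) = (1-\alpha)\, e^{T}(n\mathbf{1}-\alpha A)^{-1} s
  \;\le\; \frac{1-\alpha}{n}\left[\, e^{T}s + \frac{\alpha/n}{1-\alpha}\; e^{T}Cs \,\right]
  \;\le\; \frac{1-\alpha}{n}\left[\, n + \frac{\alpha\, n}{1-\alpha} \,\right]
  \;=\; (1-\alpha) + \alpha \;=\; 1 .
```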

Next we investigate the function $f(\ell)$ in the limit $\ell \downarrow \alpha\lambda_{\max}$. The matrix $(\ell\mathbf{1} - \alpha A)^{-1}$ has only nonnegative components, and its component $(\ell - \alpha\lambda_{\max})^{-1} v_{\max}v_{\max}^{T}$ blows up. We have seen in Section 4.2 that $v_{\max} > 0$. Furthermore, we have $s \geq 0$ and $s \neq 0$. Hence $v_{\max}^{T}s > 0$. We conclude that $\lim_{\ell \downarrow \alpha\lambda_{\max}} f(\ell) = \infty$.

From all the above it follows that $f(\ell)$ on the interval $(\alpha\lambda_{\max}, n]$ is a decreasing function spanning at least the whole range $[1, \infty)$, and hence has to intersect the value 1 for some $\ell$. This proves the existence of a solution $\ell_* \in (\alpha\lambda_{\max}, n]$ of (13), which implies that $r = u(\ell_*)$ is a solution of (5).

Finally we prove that this solution satisfies $r \in [0,1]^n$. From Lemma 3 we know that $(\mathbf{1}\ell_*/\alpha - A)^{-1} \geq 0$. Substitution into (9) and using $s \geq 0$ gives $r \geq 0$. From Lemma 1 it then follows that $r \in [0,1]^n$. □

Remark: We have restricted ourselves to irreducible $A$ in Def. 1. However, if $A$ is reducible then in almost all cases Theorem 1 still holds. For reducible nonnegative $A$ the Perron-Frobenius theorem gives $v_{\max} \geq 0$ instead of $v_{\max} > 0$. The proof above hinges on $v_{\max}^{T}s > 0$. This condition is satisfied as long as $s$ is not perpendicular to $v_{\max}$. For instance, if $s > 0$ then automatically $v_{\max}^{T}s > 0$. Furthermore, for $s \geq 0$ and randomly generated $A$, the probability of the event $v_{\max}^{T}s = 0$ is negligible.

Proof of Corollary 1

In the limit $\alpha \to 0$, (13) directly gives $\ell \to e^T s$ and (9) gives $r \to s$, as expected. The limit $\alpha \to 1$ is less straightforward. Let us write the decomposition of $s$ into eigenvectors of $A$ as $s = \sum_i d_i v_i$. Then, to leading order in $1 - \alpha$, (13) is solved by

$$\ell_* = \alpha\lambda_{\max} + (1-\alpha)\, e^T v_{\max}\, d_{\max}$$

(which has the correct limit $\ell_* \to \lambda_{\max}$). Substituting $\ell_*$ into (9) precisely yields (12). $\square$
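Both limits can be probed numerically. The sketch below is a rough check under assumptions: it again takes $f(\ell) = (1-\alpha)\, e^T(\ell\mathbf{1} - \alpha A)^{-1} s$, expands $s$ in the right eigenvectors of $A$ (so that $d = V^{-1}s$), and compares the bisection solution for $\alpha$ close to 1 with the first-order expression $\alpha\lambda_{\max} + (1-\alpha)\, e^T v_{\max} d_{\max}$. The helper name `solve` and the random data are hypothetical.

```python
import numpy as np

def solve(A, s, alpha, iters=200):
    """Bisection for f(ell) = 1 on (alpha*lambda_max, n]; returns (ell, r)."""
    n = A.shape[0]
    I, e = np.eye(n), np.ones(n)
    lo, hi = alpha * np.linalg.eigvals(A).real.max(), float(n)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = (1 - alpha) * e @ np.linalg.solve(mid * I - alpha * A, s)
        lo, hi = (mid, hi) if f_mid > 1.0 else (lo, mid)
    r = (1 - alpha) * np.linalg.solve(I - alpha * A / hi, s)
    return hi, r

rng = np.random.default_rng(3)
n = 5
A = rng.random((n, n))
s = rng.random(n)

# Limit alpha -> 0: ell tends to e^T s and r tends to s.
ell_small, r_small = solve(A, s, 1e-8)

# Limit alpha -> 1: compare with the first-order expression.
lam, V = np.linalg.eig(A)              # right eigenvectors as columns of V
m = np.argmax(lam.real)                # index of the Perron eigenvalue
d = np.linalg.solve(V, s)              # coefficients of s = sum_i d_i v_i
c = (d[m] * V[:, m].sum()).real        # e^T v_max d_max (scale invariant)
alpha = 1 - 1e-3
ell_near_one, _ = solve(A, s, alpha)
approx = alpha * lam[m].real + (1 - alpha) * c
print(ell_small, s.sum(), ell_near_one, approx)
```

The product $d_{\max}\, e^T v_{\max}$ does not depend on how the eigenvector is normalised, which is why no particular scaling of `V` is needed.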

Proof of Theorem 2

From the fact that $f(\ell)$ is monotonically decreasing, it follows that the $\ell_*$ given by Theorem 1 is the only solution of $f(\ell) = 1$ on the interval $\ell > \alpha\lambda_{\max}$. Next we consider solutions $\ell'$ on the interval $(0, \alpha\lambda_{\max})$. In order for $u(\ell')$ to be nonnegative, it has to satisfy $a^T u(\ell') \geq 0$ for all $a \geq 0$. If we can find a counterexample then we know that $u(\ell')$ is not nonnegative. One counterexample is $a = v_{\max}$. From (9) we have

$$v_{\max}^T u(\ell') = \frac{(1-\alpha)\, v_{\max}^T s}{1 - \alpha\lambda_{\max}/\ell'},$$

which is negative for $\ell' < \alpha\lambda_{\max}$. $\square$

Proof of Theorem 3

Eq. (8) can be written as $[r - (1-\alpha)s]\,\ell = \alpha A r$. The first order part of this equation (linear in $\delta A$ and $\delta r$) is given by $\ell\,\delta r + [r - (1-\alpha)s]\, e^T \delta r = \alpha\,\delta A\, r + \alpha A\,\delta r$. Gathering together all the terms multiplying $\delta r$ and $\delta A$, then using $r - (1-\alpha)s = (\alpha/\ell) A r$, and finally isolating $\delta r$, we get

$$\delta r = \alpha \left[\ell\mathbf{1} - \alpha A + \frac{\alpha}{\ell} A r e^T\right]^{-1} \delta A\, r.$$



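In index notation it reads

$$\delta r_x = \alpha \sum_{z,y} \left(\left[\ell\mathbf{1} - \alpha A + \frac{\alpha}{\ell} A r e^T\right]^{-1}\right)_{xz} (\delta A)_{zy}\, r_y. \tag{22}$$

We also know from elementary differential calculus that $\delta r_x = \sum_{z,y} \frac{\partial r_x}{\partial A_{zy}} (\delta A)_{zy}$. $\square$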
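The first-order formula can be compared against a finite perturbation. The following sketch is illustrative only: it assumes the linear response $\delta r = \alpha\,[\ell\mathbf{1} - \alpha A + (\alpha/\ell) A r e^T]^{-1}\, \delta A\, r$ as reconstructed from the proof, solves the reputation equation by bisection on the assumed function $f(\ell) = (1-\alpha)\, e^T(\ell\mathbf{1} - \alpha A)^{-1} s$, and uses hypothetical names and random data.

```python
import numpy as np

def solve(A, s, alpha, iters=200):
    """Bisection for f(ell) = 1 on (alpha*lambda_max, n]; returns (ell, r)."""
    n = A.shape[0]
    I, e = np.eye(n), np.ones(n)
    lo, hi = alpha * np.linalg.eigvals(A).real.max(), float(n)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = (1 - alpha) * e @ np.linalg.solve(mid * I - alpha * A, s)
        lo, hi = (mid, hi) if f_mid > 1.0 else (lo, mid)
    r = (1 - alpha) * np.linalg.solve(I - alpha * A / hi, s)
    return hi, r

rng = np.random.default_rng(2)
n, alpha = 4, 0.5
A = 0.1 + 0.8 * rng.random((n, n))     # entries kept away from 0 and 1
s = rng.random(n)
ell, r = solve(A, s, alpha)

# Linear-response matrix; A r e^T is the outer product of A r with e.
M = np.linalg.inv(ell * np.eye(n) - alpha * A
                  + (alpha / ell) * np.outer(A @ r, np.ones(n)))

dA = 1e-6 * rng.standard_normal((n, n))   # small perturbation of A
dr_formula = alpha * M @ dA @ r           # first-order prediction
_, r_perturbed = solve(A + dA, s, alpha)
dr_numeric = r_perturbed - r              # actual change of the solution
print(np.max(np.abs(dr_formula - dr_numeric)))
```

In this notation the individual partial derivatives would be `alpha * M[x, z] * r[y]`, and the residual printed above should be of second order in the size of `dA`.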

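Proof of $r_\zeta = (1 + \alpha\zeta/\ell_0)\, r_0$

Let us modify $A$ to $A' = A + \zeta\mathbf{1}$, with $\zeta \in [0,1]$. The solution of (8) using $A'$ will be denoted as $r_\zeta$. Thus we have $r_\zeta = (1-\alpha)s + \alpha A' r_\zeta / e^T r_\zeta$. Next we check whether there is a solution of the form $r_\zeta = k r_0$, where $k$ is some constant. Since $e^T r_\zeta = k\,\ell_0$ with $\ell_0 = e^T r_0$, the factor $k$ cancels in the last term, and this yields $k r_0 = (1-\alpha)s + \alpha (A + \zeta\mathbf{1}) r_0/\ell_0$. We use (8) to replace the expression $(1-\alpha)s + \alpha A r_0/\ell_0$ by $r_0$. This yields $(1 + \alpha\zeta/\ell_0 - k)\, r_0 = 0$. Since $r_0 \neq 0$ we conclude that $k = 1 + \alpha\zeta/\ell_0$. Theorem 2 guarantees that the found solution $r_\zeta = k r_0$ is unique. $\square$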
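The scaling relation is easy to confirm on a small example. The sketch below is a hedged illustration: it assumes that $\mathbf{1}$ in $A' = A + \zeta\mathbf{1}$ denotes the identity matrix, uses a matrix $A$ with zero diagonal so that $A'$ stays within $[0,1]$, and solves both problems with the same bisection on the assumed function $f(\ell) = (1-\alpha)\, e^T(\ell\mathbf{1} - \alpha A)^{-1} s$. All names and data are hypothetical.

```python
import numpy as np

def solve(A, s, alpha, iters=200):
    """Bisection for f(ell) = 1 on (alpha*lambda_max, n]; returns (ell, r)."""
    n = A.shape[0]
    I, e = np.eye(n), np.ones(n)
    lo, hi = alpha * np.linalg.eigvals(A).real.max(), float(n)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = (1 - alpha) * e @ np.linalg.solve(mid * I - alpha * A, s)
        lo, hi = (mid, hi) if f_mid > 1.0 else (lo, mid)
    r = (1 - alpha) * np.linalg.solve(I - alpha * A / hi, s)
    return hi, r

rng = np.random.default_rng(4)
n, alpha, zeta = 5, 0.7, 0.3
A = rng.random((n, n))
np.fill_diagonal(A, 0.0)                 # no self-rating in the baseline
s = rng.random(n)

ell0, r0 = solve(A, s, alpha)                            # baseline solution
ell_z, r_z = solve(A + zeta * np.eye(n), s, alpha)       # modified diagonal
k = 1 + alpha * zeta / ell0                              # predicted factor
print(np.max(np.abs(r_z - k * r0)), ell_z - ell0)
```

In this example every entry of the solution is multiplied by the same factor `k`, so the ranking of the entities is unchanged while the sum of the reputation values grows by `alpha * zeta`.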